1. AUTOGRAD: AUTOMATIC DIFFERENTIATION:

Central to all neural networks in PyTorch is the autograd package. Let’s first briefly visit this, and we will then go to training our first neural network.

The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
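
A tiny sketch of what define-by-run means in practice: ordinary Python control flow decides the graph at runtime, so the graph can have a different shape on every run. (The loop condition below is made up for illustration.)

import torch

x = torch.randn(3, requires_grad=True)
y = x * 2
while y.norm() < 1000:   # data-dependent loop: graph depth varies from run to run
    y = y * 2
print(y)   # y's grad_fn chain is as long as the loop happened to run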

1.1. Create a tensor and set requires_grad=True to track computation with it:

torch.Tensor is the central class of the package. If you set its attribute .requires_grad as True, it starts to track all operations on it. When you finish your computation you can call .backward() and have all the gradients computed automatically. The gradient for this tensor will be accumulated into the .grad attribute.

define the following computation:
x = [[1, 1], [1, 1]]
y = x + 2
z = y^2 * 3
out = z.mean()


In [1]:
import torch

# create a tensor, setting its .requires_grad to True
x = torch.ones(2, 2, requires_grad=True)
print(x)

x1 = torch.ones(2,2,requires_grad=False)
# x1.requires_grad_(True)
print(x1)


tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
tensor([[1., 1.],
        [1., 1.]])

1.2. Do a tensor operation:


In [2]:
y = x + 2
print(y)

y1 = x1 + 2
print(y1)


tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)
tensor([[3., 3.],
        [3., 3.]])

y was created as a result of an operation, so it has a grad_fn; y1 was not, so its grad_fn is None.


In [3]:
print(y.grad_fn)
print(y1.grad_fn)


<AddBackward0 object at 0x0000022448FFFDD8>
None

1.3. Do more operations on y


In [4]:
z = y * y * 3
z1 = y1 * y1 * 3
out = z.mean()     # calculate the mean of z
out1 = z1.mean()   # calculate the mean of z1

print(z, out)
print(z1, out1)


tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward1>)
tensor([[27., 27.],
        [27., 27.]]) tensor(27.)

.requires_grad_( ... ) changes an existing Tensor's requires_grad flag in-place (the flag defaults to False when a tensor is created).

Tensor and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each tensor has a .grad_fn attribute that references the Function that created the Tensor (except for Tensors created by the user; their grad_fn is None).


In [5]:
a = torch.randn(2, 2)    # a is created by the user, so its .grad_fn is None
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)   # change the .requires_grad attribute of a in-place
print(a.requires_grad)
b = (a * a).sum()        # b is the sum of the squares of all elements of a
print(b.grad_fn)


False
True
<SumBackward0 object at 0x000002244B80EE48>

2. Gradients:

2.1. Let’s backprop now.

Because out contains a single scalar, out.backward() is equivalent to out.backward(torch.tensor(1.)).


In [6]:
out.backward()
# out.backward(torch.tensor(1.))
# out1.backward()

You can read the accumulated gradients as follows:


In [7]:
x_grad = x.grad
y_grad = y.grad
z_grad = z.grad
print(x_grad)
print(y_grad)
print(z_grad)


tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
None
None
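
Only x receives a gradient here. Since $out = \frac{1}{4}\sum_i 3(x_i+2)^2$, we have $\frac{\partial out}{\partial x_i} = \frac{3}{2}(x_i+2) = 4.5$. The gradients of y and z are None because .grad is only populated for leaf tensors with requires_grad=True; intermediate tensors drop their gradients during backward unless you opt in. A small sketch using .retain_grad() (the opt-in for intermediates):

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
z.retain_grad()               # keep the gradient of the intermediate tensor z
z.mean().backward()

print(x.is_leaf, y.is_leaf)   # True False
print(z.grad)                 # tensor([[0.2500, 0.2500], [0.2500, 0.2500]])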

2.2. Now let’s take a look at an example of Jacobian-vector product:

If you want to compute the derivatives, you can call .backward() on a Tensor. If the Tensor is a scalar (i.e., it holds a single element), you don't need to specify any arguments to backward(); however, if it has more elements, you need to specify a gradient argument that is a tensor of matching shape.

define the following computation:
x = [1, 1, 1]
y = x + [1, 2, 3]
z = y^3


In [8]:
x = torch.ones(3, requires_grad=True)
y = x + torch.tensor([1., 2., 3.])
z = y * y * y
print(z)

v = torch.tensor([1, 0.1, 0.01])
# z is a vector, so you need to specify a gradient whose size is the same as z
z.backward(v)    
print(x.grad)


tensor([ 8., 27., 64.], grad_fn=<MulBackward0>)
tensor([12.0000,  2.7000,  0.4800])
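
To check the numbers by hand: $y = (2, 3, 4)$ and $z_i = y_i^3$, so the Jacobian is diagonal with entries $\frac{\partial z_i}{\partial x_i} = 3y_i^2 = (12, 27, 48)$. The Jacobian-vector product with $v = (1, 0.1, 0.01)$ is therefore $(12, 2.7, 0.48)$, which matches x.grad above.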

Question 1:

What is the tensor passed into .backward()? Try different inputs and answer.

When the variable is a scalar, the gradient argument passed to .backward() can be omitted. This argument effectively acts as a coefficient on the functional relation between the output and x in the code below; it defaults to 1.0.

The code below differentiates $y=x^2+2x+4$. When the weight is specified as 2, the gradient doubles, i.e., it is equivalent to differentiating $y=2x^2+4x+8$.


In [9]:
x = torch.ones(1, requires_grad=True)
y = x * x + 2 * x + 4
y.backward(retain_graph=True)
print(x.grad)

x.grad.data.zero_()

y.backward(torch.tensor([2.0]), retain_graph=True)
print(x.grad)


tensor([4.])
tensor([8.])

When the variable is a non-scalar tensor, the gradient argument passed to .backward() cannot be omitted, and its shape must match that of the variable. This argument again acts as a set of coefficients, weighting the components of x.


In [10]:
x = torch.ones(2, requires_grad=True)
t = x + torch.tensor([1., 2.])
y = t * t + 2 * t + 4
y.backward(torch.tensor([1., 0.1]), retain_graph=True)
print(x.grad)


tensor([6.0000, 0.8000])
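
Checking by hand: $t = (2, 3)$ and $y_i = t_i^2 + 2t_i + 4$, so $\frac{\partial y_i}{\partial x_i} = 2t_i + 2 = (6, 8)$; weighting by $(1, 0.1)$ gives $(6.0, 0.8)$, matching x.grad above.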

3. NEURAL NETWORKS

A typical training procedure for a neural network is as follows (a compact sketch of the inner loop appears after the list):

  • Define the neural network that has some learnable parameters (or weights)
  • Iterate over a dataset of inputs
  • Process input through the network
  • Zero the gradient buffers of the optimizer
  • Compute the loss (how far is the output from being correct)
  • Propagate gradients back into the network’s parameters
  • Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient
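
The sketch below shows one iteration of the inner loop; model, optimizer, criterion, inputs, and labels are placeholders for objects defined later (compare with the train() function in section 3.1.4):

outputs = model(inputs)             # process input through the network
optimizer.zero_grad()               # zero the gradient buffers
loss = criterion(outputs, labels)   # compute the loss
loss.backward()                     # propagate gradients back into the parameters
optimizer.step()                    # update the weights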

3.1. Define the network

Let’s define a network that classifies points drawn from Gaussian distributions into three classes.

3.1.1. Show all points

Show all the points (train set and test set) that you will use.


In [11]:
# show all points, you can skip this cell
def show_original_points():
    label_csv = open('./labels/label.csv', 'r')
    label_reader = csv.reader(label_csv)
    class1_point = []
    class2_point = []
    class3_point = []
    for item in label_reader:
        if item[2] == '0':
            class1_point.append([item[0], item[1]])
        elif item[2] == '1':
            class2_point.append([item[0], item[1]])
        else:
            class3_point.append([item[0], item[1]])
    data1 = np.array(class1_point, dtype=float)
    data2 = np.array(class2_point, dtype=float)
    data3 = np.array(class3_point, dtype=float)
    x1, y1 = data1.T
    x2, y2 = data2.T
    x3, y3 = data3.T
    plt.figure()
    plt.scatter(x1, y1, c='b', marker='.')
    plt.scatter(x2, y2, c='r', marker='.')
    plt.scatter(x3, y3, c='g', marker='.')
    plt.axis()
    plt.title('scatter')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

3.1.2. Define a network

When you define a network, your class must inherit from nn.Module, and you should override the __init__ and forward methods.

Network(
(hidden): Linear(in_features=2, out_features=5, bias=True)
(sigmoid): Sigmoid()
(predict): Linear(in_features=5, out_features=3, bias=True)
)


In [12]:
import numpy as np
import matplotlib.pyplot as plt
import torchvision
import torch
import pandas as pd
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.optim as optim
import time
import csv

In [13]:
class Network(nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        '''
        Args:
            n_feature(int): size of input tensor
            n_hidden(int): size of hidden layer 
            n_output(int): size of output tensor
        '''
        super(Network, self).__init__()
        # define a linear layer
        self.hidden = nn.Linear(n_feature, n_hidden)
        # define sigmoid activation 
        self.sigmoid = nn.Sigmoid()
        self.predict = nn.Linear(n_hidden, n_output)

    def forward(self, x):
        '''
        x(tensor): inputs of the network
        '''
        # hidden layer
        h1 = self.hidden(x)
        # activate function
        h2 = self.sigmoid(h1)
        # output layer
        out = self.predict(h2)
        '''
        A linear classifier is often followed by softmax to output
        probabilities; however, the CrossEntropyLoss we use already
        performs this operation internally (it combines log-softmax
        and NLL loss), so we don't apply softmax here.
        '''
        return out
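
A quick smoke test (a sketch; the input batch is made up) to confirm that the shapes line up:

net = Network(2, 5, 3)
print(net)                    # prints the structure shown above
dummy = torch.randn(4, 2)     # a made-up batch of four 2-D points
print(net(dummy).shape)       # torch.Size([4, 3]): one logit per class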

3.1.3. Subclass Dataset

You can skip the details of the cell below when you are training a model (but it must be run).


In [14]:
class PointDataset(Dataset):
    def __init__(self, csv_file, transform=None):
        '''
        Args:
            csv_file(string): path of label file
            transform (callable, optional): Optional transform to be applied
                on a sample.
        '''
        self.frame = pd.read_csv(csv_file, encoding='utf-8', header=None)
        print('csv_file source ---->', csv_file)
        self.transform = transform

    def __len__(self):
        return len(self.frame)

    def __getitem__(self, idx):
        x = self.frame.iloc[idx, 0]
        y = self.frame.iloc[idx, 1]
        point = np.array([x, y])
        label = int(self.frame.iloc[idx, 2])
        if self.transform is not None:
            point = self.transform(point)
        sample = {'point': point, 'label': label}
        return sample
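
A usage sketch (assuming ./labels/train.csv exists, as in the main function below): the dataset is indexed like a list and returns a dict per sample.

dataset = PointDataset('./labels/train.csv', transform=torch.tensor)
print(len(dataset))                      # number of rows in the csv
sample = dataset[0]
print(sample['point'], sample['label'])  # a 2-D point tensor and its class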

3.1.4. Train function

Train a model, then show the running-loss curve and the accuracy curve.


In [15]:
def train(classifier_net, trainloader, testloader, device, lr, optimizer):
    '''
    Args:
        classifier_net(nn.Module): the model to train
        trainloader(torch.utils.data.DataLoader): train loader
        testloader(torch.utils.data.DataLoader): test loader
        device(torch.device): the device the model trains on
        lr(float): learning rate
        optimizer(torch.optim.Optimizer): the optimizer to use
    '''
    # loss function
    criterion = nn.CrossEntropyLoss().to(device)
    
    
    # save the mean value of loss in an epoch
    running_loss = []
    
    running_accuracy = []
    
    # count loss in an epoch 
    temp_loss = 0.0
    
    # count the iteration number in an epoch
    iteration = 0 

    for epoch in range(epoches):
        
        '''
        adjust learning rate when you are training the model
        '''
        # adjust learning rate
        # if epoch % 100 == 0 and epoch != 0:
        #     LR = LR * 0.1
        #     for param_group in optimizer.param_groups:
        #         param_group['lr'] = LR

        for i, data in enumerate(trainloader):
            point, label = data['point'], data['label']
            point, label = point.to(device).to(torch.float32), label.to(device)
            outputs = classifier_net(point)
            '''# TODO'''
            
            # zero the gradient buffers in the optimizer
            optimizer.zero_grad()
            
            # calculate loss value
            loss = criterion(outputs, label)

            # back propagation
            loss.backward()

            # update parameters in optimizer (update weights)
            optimizer.step()
            
            '''# TODO END'''
            
            # save loss in a list
            temp_loss += loss.item()
            iteration += 1
            # print loss value 
#             print('[{0:d},{1:5.0f}] loss {2:.5f}'.format(epoch + 1, i, loss.item()))
            #slow down speed of print function
            # time.sleep(0.5)
        running_loss.append(temp_loss / iteration)
        temp_loss = 0
        iteration = 0
        # print('test {}:----------------------------------------------------------------'.format(epoch))
        
        # call test function and return accuracy
        running_accuracy.append(predict(classifier_net, testloader, device))
    
    # show loss curve
    show_running_loss(running_loss)
    
    # show accuracy curve
    show_accuracy(running_accuracy)
    
    return classifier_net

Question 2:

Following the complete training procedure described at the beginning of this section, fill the code below into the # TODO block inside the train() function in the correct order.

# update parameters in optimizer (update weights)
optimizer.step()

# calculate loss value
loss = criterion(outputs, label)

# zero the gradient buffers in the optimizer
optimizer.zero_grad()

# back propagation
loss.backward()


In [16]:
# show running loss curve, you can skip this cell.
def show_running_loss(running_loss):
    # generate x value
    x = np.array([i for i in range(len(running_loss))])
    # generate y value
    y = np.array(running_loss)
    # define a graph
    plt.figure()
    # generate curve
    plt.plot(x, y, c='b')
    # show axis
    plt.axis()
    # define title
    plt.title('loss curve:')
    #define the name of x axis
    plt.xlabel('step')
    plt.ylabel('loss value')
    # show graph
    plt.show()

3.1.5. Test function

Test the performance of your model


In [17]:
def predict(classifier_net, testloader, device):
#     correct = [0 for i in range(3)]
#     total = [0 for i in range(3)]
    correct = 0
    total = 0
    
    with torch.no_grad():
        '''
        you can also stop autograd from tracking history on Tensors with .requires_grad=True 
        by wrapping the code block in with torch.no_grad():
        '''
        for data in testloader:
            point, label = data['point'], data['label']
            point, label = point.to(device).to(torch.float32), label.to(device)
            outputs = classifier_net(point)
            '''
            if you want to get probability of the model prediction,
            you can use softmax function here to transform outputs to probability.
            '''
            # take the index of the max logit as the predicted class
            _, predicted = torch.max(outputs, 1)
            # print('model prediction: ', predicted)
            # print('ground truth:', label, '\n')
            correct += (predicted == label).sum()
            total += label.size(0)
            # print('current correct is:', correct.item())
            # print('current total is:', total)
            
        # print('the accuracy of the model is {0:5f}'.format(correct.item()/total))
        
    return correct.item() / total

In [18]:
# show accuracy curve, you can skip this cell.
def show_accuracy(running_accuracy):
    x = np.array([i for i in range(len(running_accuracy))])
    y = np.array(running_accuracy)
    plt.figure()
    plt.plot(x, y, c='b')
    plt.axis()
    plt.title('accuracy curve:')
    plt.xlabel('step')
    plt.ylabel('accuracy value')
    plt.show()

3.1.6. Main function


In [19]:
if __name__ == '__main__':
    '''
    change train epoches here
    '''
    # number of training epochs
    epoches = 100
    
    '''
    change learning rate here
    '''
    # learning rate
    # 1e-3 means 10^-3
    lr = 1e-3
    
    '''
    change batch size here
    '''
    # batch size
    batch_size = 16
    
    
    
    
    # define a transform to pretreat data
    transform = torch.tensor
    
    # define a gpu device
    device = torch.device('cuda:0')
    
    # define a trainset
    trainset = PointDataset('./labels/train.csv', transform=transform)
    
    # define a trainloader
    trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
    
    # define a testset
    testset = PointDataset('./labels/test.csv', transform=transform)
    
    # define a testloader
    testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
    
    show_original_points()

    # define a network
    classifier_net = Network(2, 5, 3).to(device)   
    
    '''
    change optimizer here
    '''    
    # define an optimizer
    optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
    
    # get trained model
    classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Question 3:

Try different learning rates, observe the loss and accuracy curves, and explain how the learning rate affects the loss and accuracy values, and why.

Below, the learning rate is raised to 0.01.

The higher the learning rate, the faster the loss drops and the faster the accuracy rises; when the learning rate is lowered, the loss decreases more slowly and the accuracy improves more slowly. With a higher learning rate the weight updates are larger, so the loss and accuracy change faster and training quickly gets near an optimum.

However, a learning rate that is too high causes the loss and accuracy to fluctuate: the large weight updates make the parameters swing back and forth, oscillating around the minimum. A learning rate that is too low leaves the model short of the desired performance when training ends.

For this dataset, the learning rate of 1e-3 used in the code above is too low, leaving the final loss around 0.2~0.4; setting the learning rate to 0.01 not only makes the loss drop faster but also gives a slightly better final result.


In [20]:
epoches = 100
lr = 0.01
batch_size = 16
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Question 4:

Try different batch sizes (batch_size=1, batch_size=210, and values in between, 1~210), and explain how batch_size affects the loss and accuracy values, and why.

The batch size affects both convergence speed and data-processing speed.

With a larger batch, each epoch takes less time to process because there are fewer batches per epoch. However, convergence slows down noticeably: the loss and accuracy converge more slowly, because fewer batches per epoch means fewer parameter updates per epoch.

With a smaller batch, each epoch takes longer to process, but the loss and accuracy converge faster, because more batches per epoch means more parameter updates per epoch.
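
As a concrete count: if the training set holds 210 points (which the choice batch_size=210 suggests), updates per epoch $= \lceil 210 / \text{batch\_size} \rceil$, i.e. 210 updates for batch_size=1, 7 for batch_size=30, and a single update for batch_size=210.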


In [21]:
epoches = 100
lr = 1e-3
batch_size = 1
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

In [22]:
epoches = 100
lr = 1e-3
batch_size = 30
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

In [23]:
epoches = 100
lr = 1e-3
batch_size = 210
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Question 5:

Use the SGD optimizer with momentum=0 and momentum=0.9, and explain how momentum affects the loss and accuracy values.

The smaller the momentum factor, the more likely training is to get stuck in a local minimum, leaving the loss high and the accuracy low. The larger the momentum factor, the more likely training is to escape local minima and head toward the global minimum, lowering the loss and raising the accuracy.
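
For reference, PyTorch's SGD with momentum factor $\mu$ (and no dampening) updates $v_{t+1} = \mu v_t + g_{t+1}$ and $p_{t+1} = p_t - lr \cdot v_{t+1}$, where $g$ is the current gradient; with $\mu = 0.9$ past gradients keep contributing, which is what lets the update roll through small local dips.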


In [24]:
epoches = 100
lr = 1e-3
batch_size = 16
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.0)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

In [25]:
epoches = 100
lr = 1e-3
batch_size = 16
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.SGD(classifier_net.parameters(), lr=lr, momentum=0.9)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Question 6:

Try the Adam and Rprop optimizers, observe the two curves, and explain how the three optimizers (SGD, Adam, Rprop) affect the loss and accuracy values.

In this experiment Adam is the least efficient, SGD comes next, and Rprop is the most efficient, reaching a lower loss and a higher accuracy faster.


In [26]:
epoches = 100
lr = 1e-3
batch_size = 16
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.Adam(classifier_net.parameters(), lr=lr)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

In [27]:
epoches = 100
lr = 1e-3
batch_size = 16
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.Rprop(classifier_net.parameters(), lr=lr)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Question 7 (open-ended):

Try tuning several of the above hyperparameters together, find the combination you consider best (fastest model convergence), and share your thoughts.

With a learning rate of 0.01 and a batch size of 30, the model using the Rprop optimizer converges fastest, and it also runs reasonably fast.

During training, a well-tuned learning rate speeds up model convergence, while increasing the batch size shortens training time but lowers the convergence speed and may even hurt the final accuracy. Of the three optimizers, Rprop works best here, SGD comes next, and Adam works worst.


In [28]:
epoches = 100
lr = 0.01
batch_size = 30
    
transform = torch.tensor
device = torch.device('cuda:0')
trainset = PointDataset('./labels/train.csv', transform=transform)
trainloader = DataLoader(dataset=trainset, batch_size=batch_size, shuffle=True)
testset = PointDataset('./labels/test.csv', transform=transform)
testloader = DataLoader(dataset=testset, batch_size=batch_size, shuffle=False)
show_original_points()

classifier_net = Network(2, 5, 3).to(device)   
optimizer = optim.Rprop(classifier_net.parameters(), lr=lr)
classifier_net = train(classifier_net, trainloader, testloader, device, lr, optimizer)


csv_file source ----> ./labels/train.csv
csv_file source ----> ./labels/test.csv

Assignment requirements:

4. DATA LOADING AND PROCESSING TUTORIAL

(Further content, read it when you are free)

A lot of effort in solving any machine learning problem goes into preparing the data. PyTorch provides many tools to make data loading easy and, hopefully, to make your code more readable. In this tutorial, we will see how to load and preprocess/augment data from a nontrivial dataset.

4.1. To run this tutorial, please make sure the following packages are installed:

  • scikit-image: For image io and transforms

    • sudo apt-get install python-numpy

    • sudo apt-get install python-scipy

    • sudo apt-get install python-matplotlib

    • sudo pip install scikit-image

  • pandas: For easier csv parsing

    • sudo apt-get install python-pandas

In [29]:
import os
import torch
import pandas as pd
from skimage import io, transform
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

plt.ion()   # interactive mode

4.2. Let’s quickly read the CSV and get the annotations in an (N, 2) array where N is the number of landmarks.


In [30]:
# read a csv file by pandas
landmarks_frame = pd.read_csv('data/faces/face_landmarks.csv')

n = 0
# read the image name; image names are stored in column 1
img_name = landmarks_frame.iloc[n, 0]
# the landmark points are stored in columns 2 to the end
landmarks = landmarks_frame.iloc[n, 1:].values
# reshape into an (N, 2) array of points
landmarks = landmarks.astype('float').reshape(-1, 2)

print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 Landmarks: {}'.format(landmarks[:4]))


Image name: 0805personali01.jpg
Landmarks shape: (68, 2)
First 4 Landmarks: [[ 27.  83.]
 [ 27.  98.]
 [ 29. 113.]
 [ 33. 127.]]

In [31]:
def show_landmarks(image, landmarks):
    """Show image with landmarks"""
    plt.imshow(image)
    plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='.', c='r')
    plt.pause(0.001)  # pause a bit so that plots are updated

plt.figure()
show_landmarks(io.imread(os.path.join('data/faces/', img_name)),
               landmarks)
plt.show()



In [32]:
class FaceLandmarksDataset(Dataset):
    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        # combine the relative path of images 
        img_name = os.path.join(self.root_dir,
                                self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:].values
        landmarks = landmarks.astype('float').reshape(-1, 2)
        # save all data we may need during training a network in a dict
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)

        return sample

Note (very important):

To define a dataset, we must first inherit from the class torch.utils.data.Dataset. When writing our own dataset, it is necessary to override the __init__, __len__, and __getitem__ methods; of course, you can define other methods as you like. A minimal skeleton follows.
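
A minimal skeleton of the three required methods (a sketch, with a made-up in-memory list as the data source):

from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, samples):
        self.samples = samples       # any indexable collection

    def __len__(self):
        return len(self.samples)     # number of samples

    def __getitem__(self, idx):
        return self.samples[idx]     # fetch one sample by index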

4.3. Let’s instantiate this class and iterate through the data samples. We will print the sizes of the first 4 samples and show their landmarks.


In [33]:
face_dataset = FaceLandmarksDataset(csv_file='data/faces/face_landmarks.csv',
                                    root_dir='data/faces/')

fig = plt.figure()

for i in range(len(face_dataset)):
    sample = face_dataset[i]

    print(i, sample['image'].shape, sample['landmarks'].shape)
    
    # create subgraph
    ax = plt.subplot(1, 4, i + 1)
    plt.tight_layout()
    ax.set_title('Sample #{}'.format(i))
    ax.axis('off')
    show_landmarks(**sample)

    if i == 3:
        plt.show()
        break


0 (324, 215, 3) (68, 2)
1 (500, 333, 3) (68, 2)
2 (250, 258, 3) (68, 2)
3 (434, 290, 3) (68, 2)